ON THE HÖLDER CONTINUITY IN RIDGE FUNCTION REPRESENTATION

RASHID A. ALIEV, AYSEL A. ASGAROVA, AND VUGAR E. ISMAILOV 


Abstract. In this paper, we show that if a multivariate function is
locally Hölder continuous of some degree and represented by a sum of
arbitrarily behaved ridge functions, then, under a suitable condition, it
can also be represented by a sum of ridge functions that are locally
Hölder continuous of the same degree.


1. Introduction 

A ridge function is a multivariate function of the form

\[ g(a \cdot x) = g(a_1 x_1 + \dots + a_m x_m), \]

where $g : \mathbb{R} \to \mathbb{R}$, $a = (a_1, \dots, a_m)$ is a fixed vector (direction) in $\mathbb{R}^m \setminus \{0\}$,
$x = (x_1, \dots, x_m)$ is the variable and $a \cdot x$ is the usual inner product in $\mathbb{R}^m$. In the
theory of partial differential equations, ridge functions have been known under
the name of plane waves (see, e.g., [14]). The term “ridge function” was devised
by Logan and Shepp in their pioneering paper [22] dedicated to the mathematics 
of computerized tomography (see also [15, 16, 24, 25]). After a 1981 paper by 
Friedman and Stuetzle [8] ridge functions started to appear also in statistics, 
especially, in the theory of projection pursuit and projection regression (see, e.g., 
[7, 8, 9]). The general idea therein was to reduce “dimension” and thus bypass 
the “curse of dimensionality”. 
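The definition above can be illustrated with a minimal Python sketch. The helper name `ridge`, the profile g(t) = t^2 and the direction a = (1, 2) are hypothetical choices made only for illustration; they do not come from the paper.

```python
from typing import Callable, Sequence


def ridge(g: Callable[[float], float],
          a: Sequence[float]) -> Callable[[Sequence[float]], float]:
    """Return the ridge function x -> g(a . x) for a fixed nonzero direction a."""
    if not any(a):
        raise ValueError("the direction a must be a nonzero vector")

    def f(x: Sequence[float]) -> float:
        if len(x) != len(a):
            raise ValueError("x and a must have the same dimension")
        # a . x is the usual inner product in R^m
        return g(sum(ai * xi for ai, xi in zip(a, x)))

    return f


# Hypothetical example: g(t) = t^2 with direction a = (1, 2) in R^2.
f = ridge(lambda t: t * t, (1.0, 2.0))

# A ridge function is constant on each hyperplane a . x = const.
# Both points below satisfy a . x = 5, so both values equal g(5) = 25.
value_p = f((1.0, 2.0))
value_q = f((3.0, 1.0))
```

The two sample points lie on the same level hyperplane of a · x, which is why the function takes the same value at both.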

Ridge functions are used in many models in neural network theory. For example,
in one of the popular models called MLP (multilayer feedforward perceptron)
model, the simplest case considers functions of the form

\[ \sum_{i=1}^{r} c_i \sigma(w^i \cdot x - \theta_i). \]

Here the weights $w^i$ are vectors in $\mathbb{R}^m$, the thresholds $\theta_i$ and the coefficients $c_i$
are real numbers and the activation function $\sigma$ is a univariate function. Note
that for each $\theta \in \mathbb{R}$ and $w \in \mathbb{R}^m \setminus \{0\}$ the function

\[ \sigma(w \cdot x - \theta) \]

is a ridge function. For a detailed survey of approximation theory of the MLP
model see [30].
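As a small illustration of the MLP formula, the following Python sketch evaluates a sum of terms c_i σ(w^i · x − θ_i). The function names, the logistic activation, and all numerical weights are assumptions made for this example only.

```python
import math


def logistic(t):
    """A common choice of activation function (an assumption, not fixed by the text)."""
    return 1.0 / (1.0 + math.exp(-t))


def mlp(weights, thresholds, coeffs, sigma):
    """Return x -> sum_i c_i * sigma(w^i . x - theta_i), a sum of r ridge functions."""
    def f(x):
        total = 0.0
        for w, theta, c in zip(weights, thresholds, coeffs):
            inner = sum(wj * xj for wj, xj in zip(w, x))  # w^i . x
            total += c * sigma(inner - theta)
        return total
    return f


# Hypothetical network with r = 2 hidden units in R^2.
weights = [(1.0, 0.0), (1.0, 1.0)]
thresholds = [0.0, 1.0]
coeffs = [2.0, -1.0]
net = mlp(weights, thresholds, coeffs, logistic)

x = (0.5, 0.5)
# The same value assembled term by term: w^1 . x - theta_1 = 0.5, w^2 . x - theta_2 = 0.
by_hand = 2.0 * logistic(0.5) - 1.0 * logistic(0.0)
```

Each summand depends on x only through the scalar w^i · x, so the network output is a sum of ridge functions with directions w^i.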

Ridge functions are also of interest to approximation theorists. In approximation
theory, these functions are used as an effective and convenient tool
for approximating complicated multivariate functions (see, e.g., [10, 11, 12, 13,
19, 21, 23, 26, 29]).

In this paper, we consider the problem of representation by sums of ridge
functions with $r$, $r > 1$, fixed directions. Let the directions $a^i \in \mathbb{R}^m \setminus \{0\}$,
$i = 1, \dots, r$, be given and pairwise linearly independent. The first problem arising
here is about the representability of a given multivariate function $f : \mathbb{R}^m \to \mathbb{R}$ as
a sum of ridge functions with the directions $a^i$, $i = 1, \dots, r$. In other words, we
want to know when the function $f$ can be written in the form

\[ f(x) = \sum_{i=1}^{r} g_i(a^i \cdot x). \tag{1.1} \]

This problem has a simple solution if $f$ depends on two variables and has partial
derivatives up to $r$-th order. For the representation of $f(x, y)$ in the following
form

\[ f(x, y) = \sum_{i=1}^{r} g_i(a_i x + b_i y) \]

it is necessary and sufficient that

\[ \prod_{i=1}^{r} \left( b_i \frac{\partial}{\partial x} - a_i \frac{\partial}{\partial y} \right) f = 0. \tag{1.2} \]

Note that the last assertion is valid also for continuous functions of two variables
provided that the derivatives are understood in the sense of distributions. It
should be remarked that this simple assertion is not directly generalized to the
case when $f$ depends on more than two variables.

Assume we know that a function $f(x)$ can be represented in the form (1.1).
Assume in addition that $f$ is of the class $C^k(\mathbb{R}^m)$. What can we say about
$g_i$? Can we say that $g_i \in C^k(\mathbb{R})$? The case $r = 1$ is obvious. In this case, if
$f \in C^k(\mathbb{R}^m)$, then for $c \in \mathbb{R}^m$ satisfying $a^1 \cdot c = 1$ we have that $g_1(t) = f(tc)$
is in $C^k(\mathbb{R})$. The same argument can be carried out for the case $r = 2$. In this
case, since the vectors $a^1$ and $a^2$ are linearly independent, there exists a vector
$c \in \mathbb{R}^m$ satisfying $a^1 \cdot c = 1$ and $a^2 \cdot c = 0$. Therefore, we obtain that the
function $g_1(t) = f(tc) - g_2(0)$ is in the class $C^k(\mathbb{R})$. Similarly, one can verify that
$g_2 \in C^k(\mathbb{R})$ (see [3]).
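The argument for r = 2 can be checked numerically. The sketch below is only an illustration under stated assumptions: the directions a^1 = (1, 1), a^2 = (1, -1), the components g_1 = sin and g_2 = cos, and the helper `dual_vector` are all hypothetical choices, not taken from the paper.

```python
import math


def dual_vector(a1, a2):
    """Solve a1 . c = 1, a2 . c = 0 for c in R^2 by Cramer's rule."""
    det = a1[0] * a2[1] - a1[1] * a2[0]
    if det == 0:
        raise ValueError("the directions must be linearly independent")
    return (a2[1] / det, -a2[0] / det)


# Hypothetical directions and ridge components.
a1, a2 = (1.0, 1.0), (1.0, -1.0)
g1, g2 = math.sin, math.cos


def f(x):
    """f(x) = g1(a1 . x) + g2(a2 . x)."""
    return (g1(a1[0] * x[0] + a1[1] * x[1])
            + g2(a2[0] * x[0] + a2[1] * x[1]))


c = dual_vector(a1, a2)  # here c = (1/2, 1/2)


def g1_recovered(t):
    """g1(t) = f(t c) - g2(0), since a1 . (t c) = t and a2 . (t c) = 0."""
    return f((t * c[0], t * c[1])) - g2(0.0)


samples = [k / 10 for k in range(-30, 31)]
max_error = max(abs(g1_recovered(t) - g1(t)) for t in samples)
```

Because g_1 is obtained from f by composition with a linear map and subtraction of a constant, it inherits whatever smoothness f has; the same reasoning no longer works once a third direction is present.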

The above question becomes quite difficult if the number of directions $r \ge 3$.
For $r = 3$, there are many smooth functions which decompose into sums of very
badly behaved ridge functions. This is a consequence of the classical Cauchy
Functional Equation (CFE). This equation is defined as

\[ h(x + y) = h(x) + h(y), \quad h : \mathbb{R} \to \mathbb{R}, \tag{1.3} \]

which has a class of simple solutions $h(x) = cx$, $c \in \mathbb{R}$. However, it easily follows
from the Hamel basis theory that CFE also has a large class of badly behaved
solutions. These solutions are called “badly behaved” because they are highly
irregular over the reals. They are, for example, not continuous at any point, not monotone on
any interval, and not bounded on any set of positive measure (see, e.g., [1]). Let $h_1$ be
any such solution of the equation (1.3). Then the zero function can be written as

\[ 0 = h_1(x) + h_1(y) - h_1(x + y). \tag{1.4} \]

Note that the functions involved in (1.4) are ridge functions with the directions
(1,0), (0,1) and (1,1) respectively. This particular example shows that
for smoothness of the representation (1.1) one must impose additional conditions
on the functions $g_i$, $i = 1, \dots, r$.

It was first proved by Buhmann and Pinkus [6] that if in (1.1) $f \in C^k(\mathbb{R}^m)$,
$k \ge r - 1$ and $g_i \in L^1_{\mathrm{loc}}(\mathbb{R})$ for each $i$, then $g_i \in C^k(\mathbb{R})$ for $i = 1, \dots, r$. In [28]
Pinkus found a strong relationship between CFE and the problem of smoothness
in ridge function representation. He generalized extensively the previous result of
Buhmann and Pinkus [6]. He showed that the solution is quite simple and natural
if the functions $g_i$ are taken from a class $\mathcal{B}$ of real-valued functions $u$ defined on
$\mathbb{R}$. By definition, $u$ is in $\mathcal{B}$ if for any function $v \in C(\mathbb{R})$ for which $u - v$ satisfies
CFE, $u - v$ is linear, i.e. $u(x) - v(x) = cx$, where $c \in \mathbb{R}$ (see [28]). The result
of Pinkus states that if in (1.1) $f \in C^k(\mathbb{R}^m)$ and each $g_i \in \mathcal{B}$, then necessarily
$g_i \in C^k(\mathbb{R})$ for $i = 1, \dots, r$.

The above representation problem was also considered by Konyagin and Kuleshov
[17, 18] and by Kuleshov [20]. They mainly analyze the continuity of the representation,
that is, the question if and when continuity of $f$ in (1.1) guarantees the
continuity of $g_i$. There are also other results concerning the smoothness of ridge
function representation generalizing the above result of Pinkus (see [20]). The
results in [17, 18, 20] involve certain subsets (convex open sets, convex bodies,
etc.) of $\mathbb{R}^m$ instead of only $\mathbb{R}^m$ itself.

The results of Pinkus [28] gave rise to the following natural and important
problem. Assume in the representation (1.1) $f \in C^k(\mathbb{R}^m)$, but the functions $g_i$
are arbitrarily behaved (that is, we allow very badly behaved functions). Can
we write $f$ as a sum $\sum_{i=1}^{r} f_i(a^i \cdot x)$ but with the $f_i \in C^k(\mathbb{R})$, $i = 1, \dots, r$? This
problem was posed in [6] and [27]. In [3], Aliev and Ismailov obtained a partial
solution to this problem. Their solution comprises the cases in which $k > 1$ and
$r - 1$ directions of given $r$ directions are linearly independent. Note that such a
condition on directions is satisfied by default if we are given three directions,
as it is assumed that all the directions are pairwise linearly independent. The
representation problem in the case of three directions was initially considered
in [5]. For bivariate functions having the degree of smoothness $k > r - 2$, the
problem was completely solved in [4].

Kuleshov [20] generalized Aliev and Ismailov’s result [3] to the other possible
cases of $k$. That is, he proved that if a function $f \in C^k(\mathbb{R}^m)$, where $k \ge 0$, is of
the form (1.1) and an $(r - 1)$-tuple of the given set of $r$ directions $a^i$ forms a linearly
independent system, then in (1.1) the functions $g_i$ can be replaced with functions
$f_i \in C^k(\mathbb{R})$, $i = 1, \dots, r$ (see [20, Theorem 3]). In [2], we proved this result using
completely different ideas. Note that our proof contains a theoretical method for
constructing the mentioned functions $f_i \in C^k(\mathbb{R})$ (see [2, Theorem 2.1, Theorem
2.2]). Using this method, we also estimated the modulus of continuity of $f_i$ in
terms of the modulus of continuity of $f$ (see [2, Remark 2]).

In this paper, we continue investigations on ridge function representations.
The question considered here is as follows. Assume $f$ is Hölder continuous on
compact subsets of $\mathbb{R}^m$ and possesses the representation (1.1). Can we replace $g_i$
with Hölder continuous functions of the same degree as $f$? We answer this question
positively in the case when $r - 1$ directions of given $r$ directions are linearly
independent. First we prove this when the space dimension $m = 2$ and the number
of directions $r = 3$. Then based on the multidimensional techniques exploited
in our previous paper [2], we formulate the main result in the general case. Note
that the main result of this paper (see Theorem 2.2) cannot be obtained directly
from the above mentioned results.

2. Main result 

We start this section with the following well-known definition. 

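Definition 2.1. We say that a function $F : \mathbb{R}^m \to \mathbb{R}$ is locally Hölder continuous
with degree $\alpha$, $0 < \alpha < 1$, if for any compact set $K \subset \mathbb{R}^m$ there is a number $M =
M(F; \alpha; K) > 0$ such that for any $x = (x_1, \dots, x_m) \in K$ and $y = (y_1, \dots, y_m) \in K$
the inequality

\[ |F(x) - F(y)| \le M \cdot \sum_{i=1}^{m} |x_i - y_i|^{\alpha} \]

holds.

The class of locally Hölder continuous functions with degree $\alpha$ is denoted by
$H_{\alpha}^{\mathrm{loc}}(\mathbb{R}^m)$. For a function $F \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R}^m)$ and a compact set $K \subset \mathbb{R}^m$ put

\[ H(F; \alpha; K) = \sup \left\{ |F(x) - F(y)| \cdot \left( \sum_{i=1}^{m} |x_i - y_i|^{\alpha} \right)^{-1} : \; x, y \in K, \; x \ne y \right\}. \]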
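To make the quantity H(F; α; K) concrete, the following Python sketch computes the largest Hölder quotient over pairs of points of a finite grid inside K. This gives only a lower bound for the supremum. The function name, the test function F(x_1, x_2) = |x_1|^{1/2} + |x_2|^{1/2}, the degree α = 1/2 and the grid on K = [-1, 1]^2 are assumptions chosen for illustration.

```python
from itertools import product


def holder_quotient_sup(F, alpha, grid):
    """Largest value of |F(x) - F(y)| / sum_i |x_i - y_i|^alpha over distinct grid points.

    Since the grid is a finite subset of K, the result is a lower bound for H(F; alpha; K).
    """
    best = 0.0
    for x, y in product(grid, repeat=2):
        if x == y:
            continue
        denom = sum(abs(xi - yi) ** alpha for xi, yi in zip(x, y))
        best = max(best, abs(F(x) - F(y)) / denom)
    return best


def F(x):
    """Hypothetical example: locally Hölder continuous with degree 1/2, not Lipschitz at 0."""
    return abs(x[0]) ** 0.5 + abs(x[1]) ** 0.5


axis = [k / 8 for k in range(-8, 9)]          # 17 nodes in [-1, 1]
grid = list(product(axis, repeat=2))          # 289 points of K = [-1, 1]^2
estimate = holder_quotient_sup(F, 0.5, grid)  # close to 1 for this F
```

For this particular F the inequality | |s|^{1/2} - |t|^{1/2} | ≤ |s - t|^{1/2} shows that the quotient never exceeds 1, and the pair x = (1, 0), y = (0, 0) attains it, so the grid estimate agrees with the true value up to rounding.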


The following lemma plays a key role in the proof of our main result. 

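Lemma 2.1. Assume a function $F \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R}^2)$, $0 < \alpha < 1$, has the form

\[ F(x, y) = h(x) + h(y) - h(x + y), \]

where $h : \mathbb{R} \to \mathbb{R}$ is an arbitrarily behaved function. Then there exists a function
$f \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R})$ such that

\[ F(x, y) = f(x) + f(y) - f(x + y). \]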

Proof. Consider the functions

\[ G(x, y) = F(x, y) - F(0, 0), \quad g(x) = h(x) - F(0, 0). \]

Then the function $G(x, y)$ also belongs to the class $H_{\alpha}^{\mathrm{loc}}(\mathbb{R}^2)$ and we have the
equalities

\[ G(x, y) = g(x) + g(y) - g(x + y), \quad G(0, 0) = 0. \tag{2.1} \]

It follows from (2.1) that

\[ g(0) = g(0) + g(0) - g(0) = G(0, 0) = 0 \]

and for any $x \in \mathbb{R}$

\[ G(x, 0) = g(x) + g(0) - g(x) = g(0) = 0, \tag{2.2} \]

\[ G(0, x) = g(0) + g(x) - g(x) = g(0) = 0, \tag{2.3} \]

\[ G(x, x) = g(x) + g(x) - g(2x) = 2g(x) - g(2x). \tag{2.4} \]
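
Since the equality (2.4) is equivalent to the equality

\[ g(x) = \frac{1}{2} g(2x) + \frac{1}{2} G(x, x), \]

we can write that

\[ g\left(\frac{1}{2^n}\right) = \frac{1}{2^n} g(1) + \frac{1}{2^n} G\left(\frac{1}{2}, \frac{1}{2}\right) + \frac{1}{2^{n-1}} G\left(\frac{1}{4}, \frac{1}{4}\right) + \dots + \frac{1}{2} G\left(\frac{1}{2^n}, \frac{1}{2^n}\right). \tag{2.5} \]

We obtain from (2.2) that for any $k \in \mathbb{N}$

\[ \left| G\left(\frac{1}{2^k}, \frac{1}{2^k}\right) \right| = \left| G\left(\frac{1}{2^k}, \frac{1}{2^k}\right) - G\left(\frac{1}{2^k}, 0\right) \right| \le \left(\frac{1}{2^k}\right)^{\alpha} \cdot H\left(G; \alpha; [-1, 1]^2\right) = \frac{c_0}{2^{k\alpha}}, \tag{2.6} \]

where $c_0 = H\left(F; \alpha; [-1, 1]^2\right)$. It follows from (2.5) and (2.6) that

\begin{align*}
\left| g\left(\frac{1}{2^n}\right) \right| &\le \frac{|g(1)|}{2^n} + \frac{1}{2^n} \cdot \frac{c_0}{2^{\alpha}} + \frac{1}{2^{n-1}} \cdot \frac{c_0}{2^{2\alpha}} + \dots + \frac{1}{2} \cdot \frac{c_0}{2^{n\alpha}} \\
&= \frac{|g(1)|}{2^n} + \frac{c_0}{2^{n+\alpha}} \left[ 1 + \frac{1}{2^{\alpha-1}} + \dots + \frac{1}{2^{(n-2)(\alpha-1)}} + \frac{1}{2^{(n-1)(\alpha-1)}} \right] \\
&\le \frac{|g(1)|}{2^n} + \frac{c_1}{2^{n\alpha}} \le \frac{c_2}{2^{n\alpha}}, \tag{2.7}
\end{align*}

where $c_1$ and $c_2$ are some constants independent of $n \in \mathbb{N}$. Consider a regular
irreducible dyadic fraction $\frac{m}{2^n} \in \left[\frac{1}{2^k}, \frac{1}{2^{k-1}}\right)$ and represent it in the binary system:

\[ \frac{m}{2^n} = \frac{1}{2^k} + \frac{1}{2^{k+p_1}} + \dots + \frac{1}{2^{k+p_s}} + \frac{1}{2^n}. \]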


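We obtain from (2.1) that

\begin{align*}
g\left(\frac{m}{2^n}\right) &= g\left(\frac{1}{2^k} + \frac{1}{2^{k+p_1}} + \dots + \frac{1}{2^{k+p_s}} + \frac{1}{2^n}\right) \\
&= g\left(\frac{1}{2^k}\right) + g\left(\frac{1}{2^{k+p_1}} + \dots + \frac{1}{2^{k+p_s}} + \frac{1}{2^n}\right) - G\left(\frac{1}{2^k}, \frac{1}{2^{k+p_1}} + \dots + \frac{1}{2^{k+p_s}} + \frac{1}{2^n}\right) \\
&= g\left(\frac{1}{2^k}\right) + g\left(\frac{1}{2^{k+p_1}}\right) + \dots + g\left(\frac{1}{2^{k+p_s}}\right) + g\left(\frac{1}{2^n}\right) \\
&\quad - G\left(\frac{1}{2^k}, \frac{1}{2^{k+p_1}} + \dots + \frac{1}{2^{k+p_s}} + \frac{1}{2^n}\right) - G\left(\frac{1}{2^{k+p_1}}, \frac{1}{2^{k+p_2}} + \dots + \frac{1}{2^{k+p_s}} + \frac{1}{2^n}\right) \\
&\quad - \dots - G\left(\frac{1}{2^{k+p_s}}, \frac{1}{2^n}\right). \tag{2.8}
\end{align*}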


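It follows from (2.2) and (2.3) that for any $a > 0$ and for any $(x, y) \in [-a, a]^2$

\[ |G(x, y)| = |G(x, y) - G(0, y)| \le H\left(F; \alpha; [-a, a]^2\right) \cdot |x|^{\alpha}, \tag{2.9} \]

\[ |G(x, y)| = |G(x, y) - G(x, 0)| \le H\left(F; \alpha; [-a, a]^2\right) \cdot |y|^{\alpha}. \tag{2.10} \]

Then considering (2.7) and (2.9) we obtain from (2.8) that

\begin{align*}
\left| g\left(\frac{m}{2^n}\right) \right| &\le \left| g\left(\frac{1}{2^k}\right) \right| + \left| g\left(\frac{1}{2^{k+p_1}}\right) \right| + \dots + \left| g\left(\frac{1}{2^{k+p_s}}\right) \right| + \left| g\left(\frac{1}{2^n}\right) \right| \\
&\quad + \left| G\left(\frac{1}{2^k}, \frac{1}{2^{k+p_1}} + \dots + \frac{1}{2^{k+p_s}} + \frac{1}{2^n}\right) \right| + \left| G\left(\frac{1}{2^{k+p_1}}, \frac{1}{2^{k+p_2}} + \dots + \frac{1}{2^{k+p_s}} + \frac{1}{2^n}\right) \right| \\
&\quad + \dots + \left| G\left(\frac{1}{2^{k+p_s}}, \frac{1}{2^n}\right) \right| \\
&\le \frac{c_2}{2^{k\alpha}} + \frac{c_2}{2^{(k+p_1)\alpha}} + \dots + \frac{c_2}{2^{(k+p_s)\alpha}} + \frac{c_2}{2^{n\alpha}} + \frac{c_0}{2^{k\alpha}} + \frac{c_0}{2^{(k+p_1)\alpha}} + \dots + \frac{c_0}{2^{(k+p_s)\alpha}} \\
&\le \frac{c_3}{2^{k\alpha}} \le c_3 \cdot \left(\frac{m}{2^n}\right)^{\alpha}, \tag{2.11}
\end{align*}

where $c_3$ is some constant.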

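Now we construct an auxiliary function $g^* = g^*(x)$ as follows. For any $m \in \mathbb{Z}$
and $n \in \mathbb{N}$ set $g^*\left(\frac{m}{2^n}\right) = g\left(\frac{m}{2^n}\right)$; for a number $x$ having in the binary system the
representation $x = p_0 + \frac{p_1}{2} + \dots + \frac{p_n}{2^n} + \dots$ set

\[ g^*(x) = \lim_{n \to \infty} g\left(p_0 + \frac{p_1}{2} + \dots + \frac{p_n}{2^n}\right). \tag{2.12} \]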


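We must prove that there exists a finite limit on the right-hand side of (2.12). It
follows from (2.1), (2.10) and (2.11) that for any $n \in \mathbb{N}$ and $k \in \mathbb{N}$

\begin{align*}
&\left| g\left(p_0 + \frac{p_1}{2} + \dots + \frac{p_n}{2^n}\right) - g\left(p_0 + \frac{p_1}{2} + \dots + \frac{p_n}{2^n} + \dots + \frac{p_{n+k}}{2^{n+k}}\right) \right| \\
&\quad \le \left| g\left(\frac{p_{n+1}}{2^{n+1}} + \dots + \frac{p_{n+k}}{2^{n+k}}\right) \right| + \left| G\left(p_0 + \frac{p_1}{2} + \dots + \frac{p_n}{2^n}, \; \frac{p_{n+1}}{2^{n+1}} + \dots + \frac{p_{n+k}}{2^{n+k}}\right) \right| \\
&\quad \le (c_3 + c_0) \cdot \left( \frac{p_{n+1}}{2^{n+1}} + \dots + \frac{p_{n+k}}{2^{n+k}} \right)^{\alpha} \le \frac{c_3 + c_0}{2^{n\alpha}}. \tag{2.13}
\end{align*}

We obtain from (2.13) that the sequence $\left\{ g\left(p_0 + \frac{p_1}{2} + \dots + \frac{p_n}{2^n}\right) \right\}_{n=1}^{\infty}$ is fundamental
and, therefore, there exists a finite limit on the right-hand side of (2.12).
Now we show that for any $x, y \in \mathbb{R}$

\[ G(x, y) = g^*(x) + g^*(y) - g^*(x + y). \tag{2.14} \]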

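Let in the binary system

\[ x = p_0 + \frac{p_1}{2} + \dots + \frac{p_n}{2^n} + \dots, \quad y = q_0 + \frac{q_1}{2} + \dots + \frac{q_n}{2^n} + \dots, \]

\[ x + y = r_0 + \frac{r_1}{2} + \dots + \frac{r_n}{2^n} + \dots \]

Denote

\[ x^{(n)} = p_0 + \frac{p_1}{2} + \dots + \frac{p_n}{2^n}, \quad y^{(n)} = q_0 + \frac{q_1}{2} + \dots + \frac{q_n}{2^n}, \]

\[ (x + y)^{(n)} = r_0 + \frac{r_1}{2} + \dots + \frac{r_n}{2^n}, \quad n \in \mathbb{N}. \]

Considering (2.1) we can write that

\[ G\left(x^{(n)}, y^{(n)}\right) = g\left(x^{(n)}\right) + g\left(y^{(n)}\right) - g\left(x^{(n)} + y^{(n)}\right). \tag{2.15} \]

Passing to the limit in (2.15) and taking into account the continuity of the function
$G(x, y)$, we have

\[ G(x, y) = g^*(x) + g^*(y) - \lim_{n \to \infty} g\left(x^{(n)} + y^{(n)}\right). \tag{2.16} \]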
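
Since $(x + y)^{(n)} - x^{(n)} - y^{(n)}$ equals 0 or $\frac{1}{2^n}$, it follows from (2.1), (2.7) and
(2.10) that

\begin{align*}
\left| g\left((x + y)^{(n)}\right) - g\left(x^{(n)} + y^{(n)}\right) \right| &\le \left| g\left((x + y)^{(n)} - x^{(n)} - y^{(n)}\right) \right| + \left| G\left(x^{(n)} + y^{(n)}, \; (x + y)^{(n)} - x^{(n)} - y^{(n)}\right) \right| \\
&\le \frac{c_2}{2^{n\alpha}} + H\left(F; \alpha; [-|x| - |y| - 1, \; |x| + |y| + 1]^2\right) \cdot \frac{1}{2^{n\alpha}}.
\end{align*}

It follows that

\[ \lim_{n \to \infty} g\left(x^{(n)} + y^{(n)}\right) = \lim_{n \to \infty} g\left((x + y)^{(n)}\right) = g^*(x + y). \]

Thus from (2.16) we obtain (2.14).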

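Now we show that $g^* \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R})$. First we prove that for any $\delta \in (0, 1)$

\[ |g^*(\delta)| \le c_3 \cdot \delta^{\alpha}. \tag{2.17} \]

Indeed, let $\delta = \frac{p_1}{2} + \dots + \frac{p_n}{2^n} + \dots \in (0, 1)$. Then it follows from (2.11) that

\[ \left| g\left(\frac{p_1}{2} + \dots + \frac{p_n}{2^n}\right) \right| \le c_3 \cdot \left( \frac{p_1}{2} + \dots + \frac{p_n}{2^n} \right)^{\alpha}. \tag{2.18} \]

Passing to the limit in (2.18) we obtain (2.17).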

It follows from (2.14), (2.17) and (2.10) that for any $a > 0$, $\delta \in (0, 1)$ and for
any $x, y \in [-a, a]$, $0 < x - y < \delta$, we have

\begin{align*}
|g^*(x) - g^*(y)| &\le |g^*(x - y)| + |G(y, x - y)| \\
&\le c_3 \cdot |x - y|^{\alpha} + H\left(F; \alpha; [-a, a]^2\right) \cdot |x - y|^{\alpha}.
\end{align*}

This means that $g^* \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R})$.
Consider the function

\[ f(x) = g^*(x) + F(0, 0). \]

Note that $f(x)$ also belongs to $H_{\alpha}^{\mathrm{loc}}(\mathbb{R})$ and we have the equality

\begin{align*}
F(x, y) &= G(x, y) + F(0, 0) = g^*(x) + g^*(y) - g^*(x + y) + F(0, 0) \\
&= f(x) + f(y) - f(x + y).
\end{align*}

The lemma has been proved. 
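The halving step used in the proof, g(x) = g(2x)/2 + G(x, x)/2, determines the values g(1/2^n) from g(1) and G alone. The following Python sketch checks this with exact rational arithmetic. The choice g(x) = x^2 (so that G(x, y) = -2xy) and the function names are assumptions made only for this illustration; the paper places no such restriction on g.

```python
from fractions import Fraction


def g_true(x):
    """Hypothetical well-behaved g with g(0) = 0."""
    return x * x


def G(x, y):
    """G(x, y) = g(x) + g(y) - g(x + y); for g(x) = x^2 this equals -2xy."""
    return g_true(x) + g_true(y) - g_true(x + y)


def g_at_dyadic(n, g_one):
    """Compute g(1/2^n) from g(1) by iterating g(x) = g(2x)/2 + G(x, x)/2."""
    value = g_one
    for k in range(1, n + 1):
        x = Fraction(1, 2 ** k)
        value = value / 2 + G(x, x) / 2
    return value


# Values rebuilt from g(1) = 1 and G only, compared with direct evaluation.
recovered = [g_at_dyadic(n, Fraction(1)) for n in range(6)]
expected = [g_true(Fraction(1, 2 ** n)) for n in range(6)]
```

Only G and the single value g(1) enter the computation, which reflects why the proof can replace an arbitrarily behaved g by a function g* built from its values at dyadic points.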

Now we are ready to prove the following theorem. 

Theorem 2.1. Assume a function $F \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R}^2)$, $0 < \alpha < 1$, has the form

\[ F(x, y) = h_1(a_1 x + b_1 y) + h_2(a_2 x + b_2 y) + h_3(a_3 x + b_3 y), \tag{2.19} \]

where the $(a_i, b_i)$, $i = 1, 2, 3$, are pairwise linearly independent directions in $\mathbb{R}^2$
and $h_i : \mathbb{R} \to \mathbb{R}$, $i = 1, 2, 3$. Then there exist functions $f_i \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R})$, $i = 1, 2, 3$,
such that

\[ F(x, y) = f_1(a_1 x + b_1 y) + f_2(a_2 x + b_2 y) + f_3(a_3 x + b_3 y). \]

Proof. Since the vectors $(a_i, b_i)$, $i = 1, 2, 3$, are pairwise linearly independent,
we can apply a nonsingular linear transformation $S : \mathbb{R}^2 \to \mathbb{R}^2$ of the coordinates
such that $S((a_1, b_1)) = (1, 0)$ and $S((a_2, b_2)) = (0, 1)$. Therefore without loss
of generality we may assume that $(a_1, b_1) = (1, 0)$, $(a_2, b_2) = (0, 1)$ and herewith
$a_3 \ne 0$, $b_3 \ne 0$. Thus we prove the theorem if we prove that for any function
$F \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R}^2)$ of the form

\[ F(x, y) = h_1(x) + h_2(y) + h_3(a_3 x + b_3 y) \tag{2.20} \]

there exist $f_i \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R})$, $i = 1, 2, 3$, such that

\[ F(x, y) = f_1(x) + f_2(y) + f_3(a_3 x + b_3 y). \tag{2.21} \]

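Since $a_3 \ne 0$, $b_3 \ne 0$, it follows from (2.20) that

\[ F\left(\frac{x}{a_3}, \frac{y}{b_3}\right) = h_1\left(\frac{x}{a_3}\right) + h_2\left(\frac{y}{b_3}\right) + h_3(x + y), \tag{2.22} \]

\[ F\left(\frac{x}{a_3}, 0\right) = h_1\left(\frac{x}{a_3}\right) + h_2(0) + h_3(x), \tag{2.23} \]

\[ F\left(0, \frac{y}{b_3}\right) = h_1(0) + h_2\left(\frac{y}{b_3}\right) + h_3(y). \tag{2.24} \]

Consider the function

\[ H(x, y) = F\left(\frac{x}{a_3}, 0\right) + F\left(0, \frac{y}{b_3}\right) - F\left(\frac{x}{a_3}, \frac{y}{b_3}\right) - h_1(0) - h_2(0). \]

Note that the function $H(x, y)$ belongs to the class $H_{\alpha}^{\mathrm{loc}}(\mathbb{R}^2)$ and from (2.22)
- (2.24) it follows that

\[ H(x, y) = h_3(x) + h_3(y) - h_3(x + y). \]

Applying Lemma 2.1, we obtain that there exists a function $f \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R})$ such
that

\[ H(x, y) = f(x) + f(y) - f(x + y). \]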

Introduce the following functions

\[ f_1(x) = F(x, 0) - h_1(0) - f(a_3 x), \quad f_2(x) = F(0, x) - h_2(0) - f(b_3 x), \quad f_3(x) = f(x). \]

It is not difficult to see that $f_i \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R})$, $i = 1, 2, 3$, and

\begin{align*}
&f_1(x) + f_2(y) + f_3(a_3 x + b_3 y) \\
&\quad = [F(x, 0) - h_1(0) - f(a_3 x)] + [F(0, y) - h_2(0) - f(b_3 y)] + f(a_3 x + b_3 y) \\
&\quad = F(x, 0) + F(0, y) - h_1(0) - h_2(0) - [f(a_3 x) + f(b_3 y) - f(a_3 x + b_3 y)] \\
&\quad = F(x, 0) + F(0, y) - h_1(0) - h_2(0) - H(a_3 x, b_3 y) \\
&\quad = F(x, 0) + F(0, y) - h_1(0) - h_2(0) - [F(x, 0) + F(0, y) - F(x, y) - h_1(0) - h_2(0)] \\
&\quad = F(x, y).
\end{align*}

Thus we obtain that (2.21) holds. The theorem has been proved. 
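The construction of f_1, f_2, f_3 in the proof can be sanity-checked numerically. The sketch below is an illustration under explicit assumptions: the functions h_1, h_2, h_3 and the numbers a_3 = 2, b_3 = -1 are hypothetical, and because the chosen h_3 is already Hölder continuous we simply take f = h_3 instead of running the construction of Lemma 2.1.

```python
import math

# Hypothetical third direction (a3, b3) with a3 != 0 and b3 != 0.
a3, b3 = 2.0, -1.0


def h1(t):
    return abs(t) ** 0.5 + 3.0


def h2(t):
    return math.sin(t) - 1.0


def h3(t):
    return abs(t) ** 0.5


def F(x, y):
    """F(x, y) = h1(x) + h2(y) + h3(a3 x + b3 y), the normalized form of the theorem."""
    return h1(x) + h2(y) + h3(a3 * x + b3 * y)


# Assumption: f = h3 satisfies H(x, y) = f(x) + f(y) - f(x + y) by definition of H.
f = h3


def f1(x):
    return F(x, 0.0) - h1(0.0) - f(a3 * x)


def f2(y):
    return F(0.0, y) - h2(0.0) - f(b3 * y)


def f3(t):
    return f(t)


points = [(i / 4, j / 4) for i in range(-8, 9) for j in range(-8, 9)]
max_residual = max(
    abs(F(x, y) - (f1(x) + f2(y) + f3(a3 * x + b3 * y))) for x, y in points
)
```

The residual is zero up to rounding, matching the chain of equalities in the proof. In the theorem itself the point is that f comes from Lemma 2.1 even when h_3 is badly behaved, which this toy example does not exercise.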

Using the multidimensional techniques exploited in our previous paper [2], 
Theorem 2.1 can be proven in a more general case. Since the proof for such a 
generalization is purely technical, we only formulate the final result. 

Theorem 2.2. Assume we are given $r$ directions $a^i$, $i = 1, \dots, r$, in $\mathbb{R}^m \setminus \{0\}$ and
$r - 1$ of them are linearly independent. Assume that a function $F \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R}^m)$,
$0 < \alpha < 1$, is of the form

\[ F(x) = \sum_{i=1}^{r} g_i(a^i \cdot x). \]

Then $F$ can be represented also in the form

\[ F(x) = \sum_{i=1}^{r} f_i(a^i \cdot x), \]

where the functions $f_i \in H_{\alpha}^{\mathrm{loc}}(\mathbb{R})$, $i = 1, \dots, r$.